Assumptions of Decision-Making Models in AGI
P. Wang and P. Hammer

Abstract
This paper analyzes the assumptions of decision-making models in the context of artificial general intelligence (AGI). It is argued that the traditional approaches, exemplified by decision theory and reinforcement learning, are inappropriate for AGI, because their fundamental assumptions about available knowledge and resources cannot be satisfied there. The decision-making process in the AGI system NARS is introduced and compared with the traditional approaches. It is concluded that realistic decision-making models must acknowledge the insufficiency of knowledge and resources, and make assumptions accordingly.

1 Formalizing decision-making

An AGI system needs to make decisions from time to time. To achieve its goals, the system must execute certain operations, which are chosen from all possible operations according to the system's beliefs about the relations between the operations and the goals, as well as their applicability to the current situation.

On this topic, the dominating normative model is decision theory [12, 3]. According to this model, "decision making" means choosing one action from a finite set of actions applicable in the current state. Each action leads to some consequent states according to a probability distribution, and each consequent state is associated with a utility value. The rational choice is the action that has the maximum expected utility (MEU). When the decision extends from single actions to action sequences, it is often formalized as a Markov decision process (MDP), where the utility function is replaced by a reward value at each state, and the optimal policy, as a collection of decisions, is the one that achieves the maximum expected total reward (usually with a discount for future rewards) in the process. In AI, the best-known approach to solving this problem is reinforcement learning [4, 16], which uses various algorithms to approach the optimal policy.

Decision theory and reinforcement learning have been widely considered as setting the theoretical foundation of AI research [11], and the recent progress in deep learning [9] is increasing the popularity of these models. In current AGI research, an influential model in this tradition is AIXI [2], in which reinforcement learning is combined with Solomonoff induction [15] to provide the probability values according to the algorithmic complexity of the hypotheses used in prediction.

Every formal model is based on some fundamental assumptions that encapsulate certain beliefs about the process to be modeled, so as to provide a coherent foundation for the conclusions derived in the model, and also to set restrictions on the situations where the model can be legitimately applied. In the following, four major assumptions of the above models are summarized (a minimal computational sketch follows the list).

The assumption on task: The task of "decision making" is to select the best action from all applicable actions at each state of the process.

The assumption on belief: The selection is based on the system's beliefs about the actions, represented as probability distributions over their consequent states.

The assumption on desire: The selection is guided by the system's desires, measured by a (utility or reward) value function defined on states, and the best action is the one with the maximum expectation.

The assumption on budget: The system can afford the computational resources demanded by the selection algorithm.
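To make these assumptions concrete, here is a minimal sketch of single-step MEU selection. Everything in it is hypothetical (the action names, outcome distributions, and utilities are invented for illustration), and each assumption appears as a structural commitment of the code:

```python
# Minimal sketch of single-step decision making under the four assumptions.
# All names, probabilities, and utilities below are hypothetical.

# Assumption on task: the applicable actions are explicitly enumerable.
# Assumption on belief: each action comes with a full probability
# distribution over its consequent states.
actions = {
    "move_left":  {"state_a": 0.8, "state_b": 0.2},
    "move_right": {"state_a": 0.3, "state_b": 0.7},
}

# Assumption on desire: a single utility function ranks every outcome.
utility = {"state_a": 1.0, "state_b": 5.0}

def expected_utility(outcomes):
    return sum(p * utility[s] for s, p in outcomes.items())

# Assumption on budget: the argmax over all actions is affordable.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # "move_right", since 0.3*1.0 + 0.7*5.0 = 3.8 > 1.8
```

An MDP formulation generalizes this one-shot selection by replacing the utility table with per-state rewards and scoring whole policies by their discounted reward sums.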
There are many situations where the above assumptions can be reasonably accepted, and the corresponding models have been successfully applied [11, 9]. However, there are reasons to argue that artificial general intelligence (AGI) is not such a field, and that there are non-trivial issues with each of the four assumptions.

Issues on task: For a general-purpose system, it is unrealistic to assume that in any state all the applicable actions are explicitly listed. Actually, in human decision making the evaluation-choice step is often far less significant than diagnosis or design [8]. Though in principle it is reasonable to assume that the system's actions are recursively composed from a set of basic operations, decision making often does not happen at the level of basic operations, but at the level of composed actions, where there are usually infinitely many possibilities. So decision making is often not about selection, but about selective composition.

Issues on belief: For a given action, the system's beliefs about its possible consequences are not necessarily specified as a probability distribution over subsequent states. Actions often have unanticipated consequences, and even the beliefs about the anticipated consequences usually do not fully specify a "state" of the environment or of the system itself. Furthermore, the system's beliefs about the consequences may be implicitly inconsistent, and so do not correspond to any probability distribution.

Issues on desire: Since an AGI system typically has multiple goals with conflicting demands, usually no uniform value function can evaluate all actions with respect to all goals within limited time. Furthermore, the goals in an AGI system change over time, and it is unrealistic to expect such a function to be defined on all future states. How desirable a situation is should be taken as part of the problem to be solved, rather than as a given.

Issues on budget: An AGI is often expected to handle unanticipated problems in real time, under various time requirements. In such a situation, even if a decision-making algorithm is considered to be of "tractable" computational complexity, it may still fail to satisfy the response-time requirement of the given situation.

None of the above issues is completely unknown, and various extensions of the traditional models have been proposed to address them [13, 22, 1], though none of these rejects the four assumptions altogether. Instead, a typical attitude is to take decision theory and reinforcement learning as idealized models for actual AGI systems to approximate, and to be evaluated against accordingly [6]. What this paper explores is the possibility of establishing normative models of decision making without accepting any of the above four assumptions. In the following, such a model is introduced and then compared with the classical models.

2 Decision making in NARS

The decision-making model to be introduced comes from the NARS project [17, 18, 20]. The objective of this project is to build an AGI in the framework of a reasoning system. Decision making is an important function of the system, though it is not carried out by a separate algorithm or module, but is tightly interwoven with other functions, such as reasoning and learning. Limited by the paper length, the following description only briefly covers the aspects of NARS that are directly related to the current discussion.
NARS is designed according to the theory that "intelligence" is the ability of a system to be adaptive while working with insufficient knowledge and resources; that is, the system must depend on a finite processing capacity, make real-time responses, remain open to unanticipated problems and events, and learn from its experience. Under this condition, it is impossible for the truth-value of the system's beliefs to be defined either in the model-theoretic style, as the extent of agreement with the state of affairs, or in the proof-theoretic style, as the extent of agreement with the given axioms. Instead, it is defined as the extent of agreement with the available evidence collected from the system's experience.

Formally, for a given statement S, the amounts of its positive and negative evidence are defined in an idealized situation and measured by w⁺ and w⁻, respectively, and the total amount of evidence is w = w⁺ + w⁻. The truth-value of S is a pair of real numbers ⟨f, c⟩, where f, the frequency, is w⁺/w and so lies in [0, 1], and c, the confidence, is w/(w + 1) and so lies in (0, 1). Therefore a belief has the form "S ⟨f, c⟩".

As the content of a belief, the statement S is a sentence in a formal language called Narsese. Each statement expresses a relation among a few concepts. For the current discussion, it is enough to know that a statement may have various internal structures for different types of conceptual relation, and can contain other statements as components. In particular, the implication statement P ⇒ Q and the equivalence statement P ⇔ Q express "If P then Q" and "P if and only if Q", respectively, where P and Q are statements themselves.

As a reasoning system, NARS can carry out three types of inference tasks:

Judgment. A judgment also has the form "S ⟨f, c⟩", and represents a piece of new experience to be absorbed into the system's beliefs. Besides adding it into memory, the system may also use it to revise or update the previous beliefs on statement S, as well as to derive new conclusions using various inference rules (including deduction, induction, abduction, analogy, etc.). Each rule uses a truth-value function to calculate the truth-value of the conclusion according to the evidence provided by the premises. For example, the deduction rule can take P ⟨f1, c1⟩ and P ⇒ Q ⟨f2, c2⟩ and derive Q ⟨f, c⟩, where ⟨f, c⟩ is calculated from ⟨f1, c1⟩ and ⟨f2, c2⟩ by the truth-value function for deduction. There is also a revision rule that merges distinct bodies of evidence on the same statement to produce more confident judgments.

Question. A question has the form "S?", and represents a request for the system to find the truth-value of S according to its current beliefs. A question may contain variables to be instantiated. Besides looking in memory for a matching belief, the system may also use the inference rules backwards to generate derived questions, whose answers will lead to answers to the original question. For example, from question Q? and belief P ⇒ Q ⟨f, c⟩, a new question P? can be proposed by the deduction rule. When there are multiple candidate answers, a choice rule is used to find the best answer among them, based on truth-value, simplicity, and so on.

Goal. A goal has the form "S!". Similar to logic programming [5], in NARS certain concepts are given a procedural interpretation, so a goal is taken as a statement to be achieved, and an operation as a statement that can be achieved by an executable routine.
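Before turning to how goals are processed, the truth-value machinery just described can be illustrated with a short sketch. It assumes the evidence-based definitions above (f = w⁺/w, c = w/(w + 1)); the deduction and revision functions follow the forms commonly given for NAL in the NARS literature, and are illustrative rather than the authors' exact implementation:

```python
# Illustrative sketch of NARS-style truth-value functions (not official code).

def truth_from_evidence(w_plus, w_minus):
    # Map evidence amounts to a (frequency, confidence) pair.
    w = w_plus + w_minus
    return (w_plus / w, w / (w + 1.0))

def deduction(f1, c1, f2, c2):
    # From P <f1, c1> and P ==> Q <f2, c2>, derive Q <f, c>.
    # Assumed NAL form: f = f1*f2, c = f1*f2*c1*c2.
    return (f1 * f2, f1 * f2 * c1 * c2)

def revision(f1, c1, f2, c2):
    # Merge two distinct bodies of evidence on the same statement by
    # converting each truth-value back to evidence amounts and adding.
    w1, w2 = c1 / (1.0 - c1), c2 / (1.0 - c2)
    w_plus = f1 * w1 + f2 * w2
    return truth_from_evidence(w_plus, (w1 + w2) - w_plus)

print(truth_from_evidence(3, 1))      # (0.75, 0.8): 4 pieces of evidence
print(deduction(0.9, 0.9, 0.8, 0.9))  # (0.72, 0.5832): weaker, less confident
print(revision(0.75, 0.8, 0.5, 0.5))  # (0.7, 0.833...): more confident than either
```

Note that deduction can only lose confidence while revision gains it, matching the intuition that derived conclusions are weaker than their premises, whereas pooled evidence is stronger.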
The processing of a goal also includes backward inference, guided by beliefs, that generates derived goals. For example, from goal Q! and belief P ⇒ Q ⟨f, c⟩, a new goal P! can be proposed by the deduction rule. If the content of a goal corresponds to an executable operation, the associated routine is invoked to directly realize the goal, like what a Prolog built-in predicate does. Under the restriction of the available knowledge and resources, no task can be accomplished perfectly. Instead, what the system attempts is to accomplish its tasks as well as its available knowledge and resources allow.

In NARS, decision making is most directly related to the processing of goals, though the other inference activities are also relevant. In Narsese, an operation is expressed by an operator (which identifies the associated routine) with an argument list (which includes both input and output arguments). The belief about the execution condition and consequence of an operation is typically represented as "(condition, operation) ⇒ consequence", which is logically equivalent to "condition ⇒ (operation ⇒ consequence)". This belief can be used in different ways. In an idealized situation (where the uncertainty of the belief and the existence of other beliefs and tasks are ignored), if "condition" is true, the execution of "operation" will make "consequence" true by forward inference; when "consequence!" is a goal, backward inference will produce "(condition, operation)!" as a derived goal, so the condition becomes a sub-goal, and the operation becomes a candidate for execution once the condition is satisfied. A sketch of this mechanism follows the footnotes below.

Footnotes:
3. Since P and Q can be events with an occurrence time, the same rules can be used for temporal reasoning, which is described in more detail in [21].
4. Different types of inference tasks may work together. For example, from important judgments of low confidence, questions can be derived, and from certain questions, goals can be derived, which if pursued give rise to curious and exploratory behaviors.
5. Like other beliefs, there is a truth-value attached, which is omitted here to simplify the discussion.
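As announced above, here is a minimal sketch of goal processing with operation beliefs. It is a deliberate idealization, assuming the belief form "(condition, operation) ⇒ consequence" while ignoring truth-values, uncertainty, and resource competition; the belief triples and operator names are hypothetical:

```python
# Minimal sketch of goal processing via backward inference.
# Beliefs are idealized (condition, operation, consequence) triples.

beliefs = [
    ("door_unlocked", "op_open_door", "door_open"),
    ("key_in_hand",   "op_unlock",    "door_unlocked"),
]
facts = {"key_in_hand"}  # statements currently believed true

def achieve(goal):
    # From goal "consequence!" and a matching belief, derive the sub-goal
    # "condition!"; once it holds, execute the operation.
    if goal in facts:
        return True
    for condition, operation, consequence in beliefs:
        if consequence == goal and achieve(condition):
            print("executing", operation)  # invoke the associated routine
            facts.add(consequence)         # idealized: execution always succeeds
            return True
    return False  # no applicable belief: the goal is unreachable here

achieve("door_open")  # prints: executing op_unlock, then executing op_open_door
```

Like a Prolog query, the recursion bottoms out either in an already-true statement or in an executable operation; unlike Prolog, the real system would weigh each step by truth-value and available resources rather than committing to exhaustive search.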